Spatial block-level pixel activity extraction optimization leveraging motion vectors

ABSTRACT

Systems, apparatuses, and methods for implementing spatial block-level pixel activity extraction optimization leveraging motion vectors are disclosed. Control logic coupled to an encoder generates block-level pixel activity metrics for a new frame based on the previously calculated block-level pixel activity data from a reference frame. A cost is calculated for each block of a new frame with respect to a corresponding block of the reference frame. If the cost is less than a first threshold, then the control logic generates an estimate of a pixel activity metric for the block which is equal to a previously calculated pixel activity metric for a corresponding block of the reference frame. If the cost is greater than the first threshold but less than a second threshold, an estimate of the pixel activity metric is generated by extrapolating from the previously calculated pixel activity metric.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 16/147,172, entitled “SPATIAL BLOCK-LEVEL PIXEL ACTIVITY EXTRACTION OPTIMIZATION LEVERAGING MOTION VECTORS”, filed Sep. 28, 2018, the entirety of which is incorporated herein by reference.

BACKGROUND

Description of the Related Art

Various applications perform encoding and decoding of images or video content. For example, video transcoding, desktop sharing, cloud gaming and gaming spectatorship are some of the applications which include support for encoding and decoding of content. Pixel activity calculations inside a block are commonly performed in different video processing and analysis algorithms. For example, block-level pixel activity can be used to determine texture types in an image for some applications. Examples of blocks include a coding tree block (CTB) for use with the high efficiency video coding (HEVC) standard or a macroblock for use with the H.264 standard. Other types of blocks for use with other types of standards are also possible.

Different methods can be used to calculate pixel activities for the blocks of an image or video frame. For example, in one implementation, a gray-level co-occurrence matrix (GLCM) is calculated for each block of the frame. GLCM data shows how often different variations of pixel brightness occur in a block. GLCM data can be calculated for pixels with distance d and angle theta. In another implementation, a two-dimensional (2D) spatial mean gradient is calculated for a given block. This gradient can capture vertical and horizontal edges. In a further implementation, a wavelet transform, or other types of transforms, is used to measure an activity parameter for a given block. Accordingly, as used herein, the terms “pixel activity metric”, “pixel activity”, or “pixel activities” are defined as a GLCM, a gradient, a wavelet transform, or other metric or summary statistic for the block. It is noted that the terms “pixel activity” and “block-level pixel activity” can be used interchangeably herein. In some cases, the pixel activity is represented using a matrix. In other cases, the pixel activity is represented using one or more values.
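As a rough illustration of one such metric, the following Python sketch computes a 2D spatial mean gradient for a block. The exact gradient definition is not given in this description, so the formula used here (the mean of absolute differences between horizontally and vertically adjacent samples) and the name mean_gradient are assumptions made only for illustration.

```python
def mean_gradient(block):
    """Mean absolute difference between adjacent samples of a 2D block.

    Assumption: 'block' is a list of rows of luma samples; the formula is
    one plausible reading of a "2D spatial mean gradient", not a definition
    taken from the text.
    """
    rows, cols = len(block), len(block[0])
    total = 0
    count = 0
    for y in range(rows):
        for x in range(cols):
            if x + 1 < cols:  # horizontal neighbor (responds to vertical edges)
                total += abs(block[y][x + 1] - block[y][x])
                count += 1
            if y + 1 < rows:  # vertical neighbor (responds to horizontal edges)
                total += abs(block[y + 1][x] - block[y][x])
                count += 1
    return total / count if count else 0.0


flat_activity = mean_gradient([[7, 7], [7, 7]])
edge_activity = mean_gradient([[0, 10], [0, 10]])
```

A flat block yields zero activity, while a block containing a vertical edge yields a positive value.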

Calculating block-level pixel activity for each block of every frame of a video can be a time-consuming and/or unnecessarily power-consuming operation depending on the type of activity metric that is going to be used. For example, calculating pixel co-occurrence from scratch requires evaluating pixel pairs for all pixels inside a block. This type of calculation requires processing the entire frame. While it is possible to perform parallel calculations in some cases, the processing demand is still high.

BRIEF DESCRIPTION OF THE DRAWINGS

The advantages of the methods and mechanisms described herein may be better understood by referring to the following description in conjunction with the accompanying drawings, in which:

FIG. 1 is a block diagram of one implementation of a system for encoding and decoding content.

FIG. 2 is a block diagram of one implementation of a server.

FIG. 3 is a block diagram of one implementation of a set of motion vectors for a sequence of video frames.

FIG. 4 is a generalized flow diagram illustrating one implementation of a method for implementing spatial block-level pixel activity extraction optimization leveraging motion vectors.

FIG. 5 is a generalized flow diagram illustrating another implementation of a method for generating block-level pixel activities for blocks of a new frame.

FIG. 6 is a generalized flow diagram illustrating one implementation of a method for determining a pixel activity metric generation scheme.

FIG. 7 is a block diagram of one implementation of consecutive frames of a video sequence and a corresponding motion vector.

FIG. 8 is a block diagram of one implementation of frames with different block metric granularity levels.

FIG. 9 is a generalized flow diagram illustrating one implementation of a method for calculating a pixel activity metric at different granularities.

DETAILED DESCRIPTION OF IMPLEMENTATIONS

In the following description, numerous specific details are set forth to provide a thorough understanding of the methods and mechanisms presented herein. However, one having ordinary skill in the art should recognize that the various implementations may be practiced without these specific details. In some instances, well-known structures, components, signals, computer program instructions, and techniques have not been shown in detail to avoid obscuring the approaches described herein. It will be appreciated that for simplicity and clarity of illustration, elements shown in the figures have not necessarily been drawn to scale. For example, the dimensions of some of the elements may be exaggerated relative to other elements.

Systems, apparatuses, and methods for implementing spatial block-level pixel activity extraction optimization techniques are disclosed herein. In one implementation, a system includes an encoder and control logic coupled to the encoder. In one implementation, the control logic leverages the motion estimation data to avoid unnecessary and expensive calculation of pixel activities. In this implementation, the control logic calculates an approximate value for the pixel activities. For example, in one implementation, the control logic uses the local motion vectors and block-level pixel activities calculated for a reference frame to generate estimates of the pixel activities for blocks of a new frame. As used herein, the term “reference frame” is defined as a frame of a video stream that is used to define and/or encode one or more future frames of the video stream. In one implementation, the control logic processes a new frame on a block-by-block basis. In one implementation, the pixel activities are a gradient calculated for the block of the new frame. In another implementation, the pixel activities include a texture analysis of the block of the new frame. In other implementations, the pixel activities include any of various summary statistics or other mathematical quantities (e.g., average, maximum value, variance).

In one implementation, the control logic calculates the difference (i.e., sum of differences cost) between a block of a new frame and a corresponding block of a reference frame, wherein the corresponding block of the reference frame is identified by a motion vector. If the difference (or “error”) is below a first threshold, then the control logic copies the pixel activities from the corresponding block of the reference frame rather than recalculating the pixel activities of the block of the new frame. If the error is greater than or equal to the first threshold but less than a second threshold, then the control logic extrapolates from the pixel activities of the corresponding block of the reference frame using the motion estimation data to generate estimates of the pixel activities of the block of the new frame. Otherwise, if the error is greater than or equal to the second threshold, then the control logic calculates the pixel activities for the block of the new frame using the conventional approach.

Referring now to FIG. 1, a block diagram of one implementation of a system 100 for encoding and decoding content is shown. System 100 includes server 105, network 110, client 115, and display 120. In other implementations, system 100 includes multiple clients connected to server 105 via network 110, with the multiple clients receiving the same bitstream or different bitstreams generated by server 105. System 100 can also include more than one server 105 for generating multiple bitstreams for multiple clients.

In one implementation, system 100 implements encoding and decoding of video content. In various implementations, different applications such as a video game application, a cloud gaming application, a virtual desktop infrastructure application, or a screen sharing application are implemented by system 100. In other implementations, system 100 executes other types of applications. In one implementation, server 105 renders video or image frames, generates pixel activity metrics for blocks of the frames, encodes the frames into a bitstream, and then conveys the encoded bitstream to client 115 via network 110. Client 115 decodes the encoded bitstream and generates video or image frames to drive to display 120 or to a display compositor.

In one implementation, server 105 generates estimates of pixel activity metrics for blocks of the frames rather than calculating pixel activity metrics from scratch. For example, a block (i,j) in a frame f is denoted by block (f,i,j). In one implementation, the pixel activity metric is available for block (fl,i,j) in the reference frame (fl). In one implementation, the motion vectors are calculated for all blocks inside the current frame f. One method of calculating motion estimation data is to determine the sum of absolute differences (SAD) between pixel samples of a reference block in the reference image, and another candidate block in the current image. Candidates for the current block are multiple locations in a “search area”, each of which has horizontal and vertical displacements of dx and dy. Motion estimation finds the vector (dx,dy) that has the smallest SAD in the search area. In this case the SAD is referred to as a “cost”. Other cost functions are possible. The reference image can be different from the current image (temporal) or the reference image can be the same image (spatial). The motion vector MV(dx, dy, i, j, C) represents the motion vector with dx and dy displacement for block (i, j) and with cost C. The pixel activity metric of Block(i, j, f) is calculated based on the pixel activity metric of block (i-dx, j-dy, fl) where fl is the reference frame with motion vector (dx, dy, i, j, C).
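The SAD-based search described above can be sketched as follows. This is a minimal exhaustive search written in Python for illustration only; the function names sad and motion_search, the frame representation (lists of rows of samples), and the use of pixel-unit displacements are assumptions, and a practical encoder would typically use a faster search strategy.

```python
def sad(cur, ref, bx, by, rx, ry, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(cur[by + y][bx + x] - ref[ry + y][rx + x])
        for y in range(size)
        for x in range(size)
    )


def motion_search(cur, ref, bx, by, size, search_range):
    """Return (dx, dy, cost) with the smallest SAD inside the search area.

    Assumption: the reference block sits at (bx - dx, by - dy), mirroring
    the block (i-dx, j-dy) convention in the text, but in pixel units.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx - dx, by - dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate falls outside the reference frame
            cost = sad(cur, ref, bx, by, rx, ry, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Toy frames: the current frame is the reference shifted right by one pixel.
ref_frame = [[y * 8 + x for x in range(8)] for y in range(8)]
cur_frame = [[ref_frame[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]
best_vector = motion_search(cur_frame, ref_frame, bx=2, by=2, size=2, search_range=1)
```

The returned tuple plays the role of MV(dx, dy, i, j, C) for one block, with the third element being the cost C that is later compared against the thresholds.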

In one implementation, if C<Threshold1, then the two blocks are similar, and the pixel activity metric of the new block is estimated as the pixel activity metric of the related block in the reference frame. If C≥Threshold1 and C<Threshold2, then the two blocks are similar with some differences. In this case, depending on the requirements, a transfer function is used to map the cost C to a correction factor. The correction factor is used to update the pixel activity metric of the reference block to generate an estimate of pixel activity metric for the current block. Alternatively, a machine-learning solution is used to gather some information including the motion estimation cost and the reference-block pixel activity metric to calculate the pixel activity metric of the new block. If C≥Threshold2, then the two blocks are different, and the pixel activity metric for that block is calculated from scratch.
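The three-way decision described above can be sketched in Python as shown below. This is an illustrative sketch only: the function name estimate_activity, the scalar metric, and the multiplicative use of the correction factor are assumptions (the description says only that the correction factor is used to update the reference metric), and a cost exactly equal to the second threshold is treated as requiring full calculation, following the greater-than-or-equal convention used elsewhere in this description.

```python
def estimate_activity(cost, ref_activity, threshold1, threshold2,
                      transfer, compute_from_scratch):
    """Pick a pixel activity value for one block based on its cost.

    Assumptions: the metric is a scalar, 'transfer' maps a cost to a
    multiplicative correction factor, and 'compute_from_scratch' is a
    callable that performs the conventional calculation.
    """
    if cost < threshold1:
        return ref_activity                   # blocks are similar: reuse
    if cost < threshold2:
        return ref_activity * transfer(cost)  # similar with differences: correct
    return compute_from_scratch()             # blocks differ: recalculate


def toy_transfer(cost):
    # Hypothetical transfer function: larger cost, larger correction.
    return 1.0 + cost / 1000.0


copied = estimate_activity(50, 12.0, 100, 400, toy_transfer, lambda: 99.0)
corrected = estimate_activity(200, 12.0, 100, 400, toy_transfer, lambda: 99.0)
recalculated = estimate_activity(400, 12.0, 100, 400, toy_transfer, lambda: 99.0)
```

Passing the conventional calculation as a callable keeps the expensive path from running unless the cost actually reaches the second threshold.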

Network 110 is representative of any type of network or combination of networks, including wireless connection, direct local area network (LAN), metropolitan area network (MAN), wide area network (WAN), an Intranet, the Internet, a cable network, a packet-switched network, a fiber-optic network, a router, storage area network, or other type of network. Examples of LANs include Ethernet networks, Fiber Distributed Data Interface (FDDI) networks, and token ring networks. In various implementations, network 110 includes remote direct memory access (RDMA) hardware and/or software, transmission control protocol/internet protocol (TCP/IP) hardware and/or software, routers, repeaters, switches, grids, and/or other components.

Server 105 includes any combination of software and/or hardware for rendering video/image frames and encoding the frames into a bitstream. In one implementation, server 105 includes one or more software applications executing on one or more processors of one or more servers. Server 105 also includes network communication capabilities, one or more input/output devices, and/or other components. The processor(s) of server 105 include any number and type (e.g., graphics processing units (GPUs), central processing units (CPUs), digital signal processors (DSPs), field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs)) of processors. The processor(s) are coupled to one or more memory devices storing program instructions executable by the processor(s). Similarly, client 115 includes any combination of software and/or hardware for decoding a bitstream and driving frames to display 120. In one implementation, client 115 includes one or more software applications executing on one or more processors of one or more computing devices. In various implementations, client 115 is a computing device, game console, mobile device, streaming media player, or other type of device.

Turning now to FIG. 2, a block diagram of one implementation of the software components of a server 200 for encoding frames of a video is shown. It is noted that in other implementations, server 200 includes other components and/or is arranged in other suitable manners than is shown in FIG. 2. A new frame 205 of a video is received by server 200 on interface 208 and coupled to motion vector unit 210, control logic 220, and encoder 230. Depending on the implementation, interface 208 is a bus interface, a memory interface, or an interconnect to a communication fabric and/or other type(s) of device(s). Each of motion vector unit 210, control logic 220, and encoder 230 is implemented using any suitable combination of hardware and/or software. Motion vector unit 210 generates motion vectors 215 for the blocks of new frame 205 based on a comparison of new frame 205 to reference frame 207. In one implementation, reference frame 207 is stored in memory 240. Memory 240 is representative of any number and type of memory or cache device(s) for storing data and/or instructions associated with the encoding process.

Motion vectors 215 are provided to control logic 220 and encoder 230. Control logic 220 generates estimated pixel activities 225 from reference frame calculated pixel activities 222 based on motion vectors 215. For example, in one implementation, control logic 220 processes new frame 205 on a block-by-block basis. For each block, control logic 220 retrieves the calculated sum of differences cost between the block and a corresponding block in reference frame 207 identified by a corresponding motion vector 215. If the sum of differences cost for the block is less than a first threshold, then control logic 220 generates an estimated pixel activity 225 for the block as the previously calculated pixel activity 222 for the corresponding block of reference frame 207.

If the sum of differences cost for the block is greater than or equal to the first threshold but less than a second threshold, then control logic 220 generates the estimated pixel activity 225 for the block by extrapolating from the pixel activity 222 of a corresponding block in the reference frame 207 based on the motion vector 215 of the block. For example, in one implementation, control logic 220 uses a transfer function to map the sum of differences cost to a correction factor which is used to update the pixel activity 222 of the corresponding block from reference frame 207. This updated pixel activity from the corresponding block from reference frame 207 is then used as the estimated pixel activity 225 for the block of new frame 205. In one implementation, the transfer function is applied using a lookup table. If the sum of differences cost for the block is greater than or equal to the second threshold, then control logic 220 generates the pixel activity for the block from scratch using conventional techniques.
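One way a lookup-table transfer function could be realized is sketched below in Python. The breakpoint and factor values are hypothetical placeholders chosen only for illustration; this description does not specify the table contents or how the correction factor is combined with the reference metric.

```python
import bisect

# Hypothetical table: costs below 100 map to 1.00, costs in [100, 200)
# map to 1.05, costs in [200, 300) map to 1.10, and 300 or more to 1.20.
COST_BREAKPOINTS = [100, 200, 300]
CORRECTION_FACTORS = [1.00, 1.05, 1.10, 1.20]


def correction_factor(cost):
    """Map a sum of differences cost to a correction factor by table lookup."""
    index = bisect.bisect_right(COST_BREAKPOINTS, cost)
    return CORRECTION_FACTORS[index]


low_factor = correction_factor(50)
mid_factor = correction_factor(250)
high_factor = correction_factor(1000)
```

A table of this kind trades precision for speed: the lookup is a short binary search rather than an evaluation of an arbitrary function.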

Referring now to FIG. 3, a block diagram of one implementation of a set of motion vectors 315A-C for a sequence of video frames 305A-D is shown. Frames 305A-D represent consecutive frames of a video sequence. Box 310A represents an individual block of pixels within frame 305A. Box 310A can also be referred to as a macroblock. The arrow 315A represents the known motion of the imagery within box 310A as the video sequence moves from frame 305A to 305B. The known motion illustrated by arrow 315A can be defined by a motion vector. It is noted that although motion vectors 315A-C point in the direction of motion of box 310 in subsequent frames, in another implementation, a motion vector can be defined to point in a direction opposite to the motion of the imagery. For example, in some compression standards, a motion vector associated with a macroblock points to the source of that block in the reference frame. The reference frame can be forward or backward in time. It is also noted that motion vectors can represent entropy in some implementations.

In one implementation, boxes 310B-D can be tracked in subsequent frames using motion vectors 315A-C. For example, the motion vector 315A indicates the change in position of box 310B in frame 305B as compared to box 310A in frame 305A. Similarly, motion vector 315B indicates the change in location of box 310C in frame 305C as compared to box 310B in frame 305B. Also, motion vector 315C indicates the change in location of box 310D in frame 305D as compared to box 310C in frame 305C. In another implementation, a motion vector is defined to track the reverse motion of a block from a given frame back to the previous frame.

In one implementation, when an encoder needs to generate various pixel activity metrics for the blocks of a new frame, the encoder generates estimates of the pixel activity metrics based on the previously calculated pixel activity metrics of the corresponding blocks in a previous frame. In one implementation, the encoder uses the motion vectors 315A-C to identify which corresponding block in the previous frame matches a given block in a new frame. The encoder then uses the previously calculated pixel activity metric for the identified block in the previous frame to help in generating the estimate of the pixel activity metric of the block in the new frame. In one implementation, the encoder uses the previously calculated pixel activity metric, without modifications, as the estimate of the pixel activity metric of the block in the new frame. In another implementation, the encoder extrapolates from the previously calculated pixel activity metric by using a transfer function to map the sum of differences cost or any other cost between the blocks to a correction factor. The correction factor is then applied to the previously calculated pixel activity to generate the estimate of the pixel activity metric for the block in the new frame.

Turning now to FIG. 4, one implementation of a method 400 for implementing spatial block-level pixel activity extraction optimization leveraging motion vectors is shown. For purposes of discussion, the steps in this implementation and those of FIGS. 5-6 are shown in sequential order. However, it is noted that in various implementations of the described methods, one or more of the elements described are performed concurrently, in a different order than shown, or are omitted entirely. Other additional elements are also performed as desired. Any of the various systems or apparatuses described herein are configured to implement method 400.

A motion vector unit calculates a sum of absolute differences cost for blocks of a new frame of a video stream in comparison to corresponding blocks of one or more reference frames (block 405). In one implementation, the cost is a sum of absolute differences cost. In other implementations, the cost is other types of costs or errors calculated from the blocks of the new frame and corresponding blocks of one or more reference frames. In one implementation, the motion vector unit is part of an encoder, with the encoder also including control logic. In one implementation, the encoder is implemented on a system with at least one processor coupled to at least one memory device. In one implementation, the encoder is implemented on a server which is part of a cloud computing environment. The motion vector unit generates motion vectors for blocks of the new frame based on comparisons to corresponding blocks of the reference frame(s) (block 410). Encoder control logic receives the costs and motion vectors for the blocks of the new frame (block 415).

The control logic also receives the previously calculated block-level pixel activities for blocks of the reference frame(s) (block 420). Then, the control logic generates estimates of the block-level pixel activities for blocks of the new frame based on the costs and motion vectors for the blocks of the new frame and based on the previously calculated block-level pixel activities for blocks of the reference frame(s) (block 425). After block 425, method 400 ends. One example of an implementation for performing block 425 is described in further detail below in the discussion of FIG. 5. In various implementations, the block-level pixel activities for blocks of the new frame are used to help with the encoding of the new frame, for classifying the new frame, and/or for performing further analysis of the new frame. In one implementation, the block-level pixel activities for blocks of the new frame are used to classify the new frame into one of a plurality of categories, to identify what types of objects are present in the new frame, and/or to perform other actions.

Turning now to FIG. 5, one implementation of a method 500 for generating block-level pixel activities for blocks of a new frame is shown. Encoder control logic receives a new frame to encode (block 505). The encoder control logic is implemented using any suitable combination of hardware and/or software. The control logic analyzes the new frame on a block-by-block basis (block 510). It is assumed for the purposes of this discussion that motion estimation data (e.g., motion vectors) and costs have already been calculated for the blocks of the new frame when the control logic receives the new frame to encode. In implementations where the motion estimation data or costs have not been calculated, the control logic calculates these values in response to receiving the new frame to encode. Any type of costs can be calculated depending on the implementation.

For each block of the new frame, if the cost of the block is less than a first threshold (conditional block 515, “yes” leg), then the control logic generates an estimate of pixel activities for the block as being equal to previously calculated pixel activities of a corresponding block in one or more reference frames (block 520). If the cost of the block is greater than or equal to the first threshold (conditional block 515, “no” leg), then the control logic determines if the cost of the block is less than a second threshold (conditional block 525). If the cost of the block is less than the second threshold (conditional block 525, “yes” leg), then the control logic generates an estimate of pixel activities for the block by extrapolating from the previously calculated pixel activities of the corresponding block in the reference frame(s) based on motion estimation data of the block (block 530). If the cost of the block is greater than or equal to the second threshold (conditional block 525, “no” leg), then the control logic calculates pixel activities for the block independently of the previously calculated pixel activities of the corresponding block in the reference frame(s) and motion estimation data of the block (block 535). In other words, the control logic calculates pixel activities for the block from scratch, in the conventional manner in block 535. After blocks 520, 530, and 535, method 500 ends.

Turning now to FIG. 6, one implementation of a method 600 for determining a pixel activity metric generation scheme is shown. An encoder generates a histogram of costs of blocks of a new frame in comparison to corresponding blocks of one or more reference frames (block 605). If at least a given percentage of blocks have a cost greater than a threshold (conditional block 610, “yes” leg), then the encoder calculates pixel activity metrics for blocks of the new frame in a conventional manner (block 615). The values of the given percentage and the threshold can vary according to the implementation. If less than the given percentage of blocks have a cost greater than the threshold (conditional block 610, “no” leg), then the encoder generates estimates of pixel activity metrics for blocks of the new frame based on previously calculated pixel activity metrics for blocks of the reference frame(s), costs, and motion vectors of blocks of the new frame (block 620). One example of performing block 620 is described in the discussion associated with FIG. 5. After blocks 615 and 620, method 600 ends.
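The frame-level decision of method 600 can be sketched in Python as follows. For brevity this sketch counts the high-cost blocks directly from a list of per-block costs rather than building an explicit histogram; the function name choose_scheme and the string return values are assumptions made for illustration.

```python
def choose_scheme(costs, cost_threshold, percentage):
    """Decide whether a frame's metrics are calculated or estimated.

    Returns "calculate" when at least 'percentage' percent of the blocks
    have a cost greater than 'cost_threshold', otherwise "estimate".
    """
    if not costs:
        raise ValueError("at least one block cost is required")
    above = sum(1 for cost in costs if cost > cost_threshold)
    # Integer comparison avoids floating-point rounding at the boundary.
    if above * 100 >= percentage * len(costs):
        return "calculate"
    return "estimate"


mostly_changed = choose_scheme([10, 20, 500, 600], cost_threshold=100, percentage=50)
mostly_static = choose_scheme([10, 20, 500, 600], cost_threshold=100, percentage=75)
```

With a histogram already available, the same count could be obtained by summing the bins that lie above the threshold.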

Referring now to FIG. 7, a block diagram of one implementation of consecutive frames of a video stream and a corresponding motion vector is shown. In some implementations, a pixel activity metric may not be available for an arbitrary reference block. For example, activities may be available only at macroblock granularity, which could mean that only referenced blocks at locations (16*i, 16*j) are available, where i and j are integers. It is noted that the example of a block size of 16 pixels by 16 pixels is only indicative of one implementation. In other implementations, macroblocks or coding units can have other sizes. For other motion vectors, the value of the pixel activity metric can be estimated based on the blocks from which the pixels were sourced if the error is acceptable. Pixels could be sourced from one block in a specific case, while in other cases, pixels could be sourced from up to four blocks. In the example shown in FIG. 7, pixels are sourced from two blocks.

Two successive video frames 705 and 710 at times ‘t-2’ and ‘t-1’, respectively, are shown in FIG. 7. Within video frame 705 at time ‘t-2’, block (0, 2, t-2) has pixel activity metric A_(0,2) and block (1, 2, t-2) has pixel activity metric A_(1,2). Within video frame 710 at time ‘t-1’, block (1, 2, t-1) has pixel activity metric N_(1,2). In one implementation, if the cost of block (1, 2, t-1) compared to the matching pixels of frame 705 is less than a first threshold, then the match indicated by motion vector 715 is considered to be a close match and therefore the following estimation method can be used. It is assumed for the purposes of this discussion that block (0, 2, t-2) contributes w₀ percentage of its pixels to block (1, 2, t-1) and that block (1, 2, t-2) contributes w₁ percentage of its pixels to block (1, 2, t-1). The possible error is inversely proportional to the maximum of w₀ and w₁. Based on this possible error, the blocks which require calculation of activity rather than just generating an estimate can be determined. In one implementation, an estimate could be the higher of A_(0,2) or A_(1,2) or an interpolation such as N_(1,2)=w₀*A_(0,2)+w₁*A_(1,2), where w₀+w₁=1.
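The interpolation above can be illustrated with the following Python sketch, which derives the weights from the overlap between the motion-compensated region and the block grid and then forms the weighted sum. The helper names overlap_weights and interpolate_activity, the pixel-coordinate inputs, and the use of overlap area as the weight are assumptions made for illustration; coordinates are assumed to be non-negative.

```python
def overlap_weights(rx, ry, size=16):
    """Weights of the grid blocks overlapped by a size x size region.

    (rx, ry) is the top-left pixel of the region in the reference frame.
    Returns {(block_col, block_row): weight}; the weights sum to 1.
    """
    col0, row0 = rx // size, ry // size
    ox, oy = rx % size, ry % size  # offset of the region inside the grid cell
    weights = {}
    for dcol, width in ((0, size - ox), (1, ox)):
        for drow, height in ((0, size - oy), (1, oy)):
            area = width * height
            if area:
                weights[(col0 + dcol, row0 + drow)] = area / (size * size)
    return weights


def interpolate_activity(weights, activities):
    """Weighted sum of per-block metrics, as in N = w0*A0 + w1*A1."""
    return sum(weight * activities[block] for block, weight in weights.items())


# A region offset 4 pixels horizontally draws on two blocks, as in FIG. 7.
weights = overlap_weights(rx=4, ry=32)
estimate = interpolate_activity(weights, {(0, 2): 8.0, (1, 2): 16.0})
```

The same helper covers the one-block case (an aligned region) and the four-block case (a region offset both horizontally and vertically).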

In some cases, a candidate estimated area can be calculated occasionally regardless of the possible error. For example, a candidate estimated area can be calculated when a randomly generated value x is greater than a random threshold. This is to ensure that the estimated pixel activity metric does not drift too far from the actual signal. In one scenario, if the random threshold value is set to 0.9, this would mean that 90% of the time the actual calculation of the pixel activity metric is skipped. The value of the random threshold can be reduced if more precision is needed.
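A minimal Python sketch of this occasional recalculation is shown below. It assumes that x is drawn uniformly from the range 0 to 1, which is consistent with a threshold of 0.9 skipping the calculation about 90% of the time but is not stated explicitly; the function name should_recalculate is hypothetical.

```python
import random


def should_recalculate(random_threshold, rng=random):
    """Return True when the metric should be calculated rather than estimated.

    Assumption: x is uniform on [0, 1), so a threshold of 0.9 triggers a
    real calculation roughly 10% of the time.
    """
    x = rng.random()
    return x > random_threshold


# Count how often a calculation is triggered over many estimated blocks.
generator = random.Random(0)  # seeded only to make the demonstration repeatable
triggered = sum(should_recalculate(0.9, generator) for _ in range(10000))
```

Lowering the threshold increases the share of blocks that are actually calculated, trading speed for precision.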

Turning now to FIG. 8, a block diagram of one implementation of frames with different block metric granularity levels is shown. Many pixel activity metrics, such as discrete gradients, are cumulative over an area. In one implementation, these pixel activity metrics are structured for fast estimation by storing the metric at a finer granularity, by storing summed area tables, or using other acceleration data structures allowing fast calculation of metrics that have specific mathematical properties (e.g., homogeneous, additive). Estimates are calculated quickly by aggregating appropriate values from these data structures. For example, the pixel activity metric for area 830 of low metric granularity frame 810 is estimated by adding together the pixel activity metrics for areas 815A-D of high metric granularity frame 805. This estimate is deemed acceptable as the extra area that is included by areas 815B and 815D is considered insignificant (i.e., error is calculated to be acceptable). Similarly, the area missed immediately next to areas 815A and 815C is also considered insignificant. If additional precision is needed, corrections are made by calculating and adjusting for the metric's contribution in extra needed and unneeded areas.

If the estimate for high metric granularity frame 805 is not deemed sufficiently precise, then the estimate can be corrected. For example, in one implementation, correction 825A, shown in expanded view 820, is calculated for the contribution of the metric in the correspondingly marked area. Another correction 825B is calculated for the contribution of the metric in the correspondingly marked area. Then, the pixel activity metric for the reference block is estimated by adding together the metrics for areas 815A-D plus correction 825A and minus correction 825B. In this example, the corrections that are needed are due to errors caused by a horizontal offset. With different alignments, corrections may be needed to handle vertical offset errors; or for other alignments, corrections may be needed to adjust for both vertical and horizontal offsets.
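A summed area table is one of the acceleration structures mentioned above. The Python sketch below builds such a table over a grid of additive per-cell metric values and then reads the total for any rectangle with four lookups, which is one way an exact total for a misaligned area could be obtained without separate corrections. The function names build_sat and region_sum and the grid representation are assumptions made for illustration.

```python
def build_sat(values):
    """Build a summed area table with one extra row and column of zeros.

    sat[y][x] holds the sum of values[0..y-1][0..x-1].
    """
    rows, cols = len(values), len(values[0])
    sat = [[0] * (cols + 1) for _ in range(rows + 1)]
    for y in range(rows):
        for x in range(cols):
            sat[y + 1][x + 1] = (values[y][x] + sat[y][x + 1]
                                 + sat[y + 1][x] - sat[y][x])
    return sat


def region_sum(sat, x0, y0, x1, y1):
    """Sum of the metric over cells with x0 <= x < x1 and y0 <= y < y1."""
    return sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]


grid = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
table = build_sat(grid)
whole = region_sum(table, 0, 0, 3, 3)
corner = region_sum(table, 1, 1, 3, 3)
```

This only applies to metrics that are additive over an area, as noted in the discussion of FIG. 8.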

Referring now to FIG. 9, one implementation of a method 900 for calculating a pixel activity metric at different granularities is shown. An encoder calculates pixel activity metrics at a first granularity for blocks of a reference frame (block 905). The encoder generates estimates for pixel activity metrics at a second granularity for blocks of a new frame, wherein the first granularity is a finer granularity than the second granularity, and wherein the estimates are generated based on the pixel activity metrics of blocks of the reference frame (block 910). As part of generating estimates for pixel activity metrics for blocks of the new frame at the second granularity, the encoder identifies, based on a motion vector, a plurality of blocks in the reference frame which correspond to each block of the new frame (block 915). Then, the encoder generates a cumulative estimate for the pixel activity metrics by summing the pixel activity metrics for the plurality of blocks from the reference frame for each block of the new frame (block 920). After block 920, method 900 ends.
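Blocks 915 and 920 can be sketched in Python as follows. The sketch snaps the motion-compensated region to the nearest fine-granularity block boundary and sums the fine metrics it covers, treating the small misalignment as insignificant. The function name coarse_estimate, the snapping rule, and the grid layout (rows of fine-block metrics) are assumptions made for illustration.

```python
def coarse_estimate(fine_metrics, rx, ry, fine_size, coarse_size):
    """Estimate a coarse block's metric by summing fine-granularity metrics.

    (rx, ry) is the top-left pixel of the region in the reference frame
    that the motion vector points to. The region is snapped to the nearest
    fine block boundary (an assumed simplification).
    """
    col0 = (rx + fine_size // 2) // fine_size
    row0 = (ry + fine_size // 2) // fine_size
    per_side = coarse_size // fine_size  # fine blocks per coarse block side
    return sum(
        fine_metrics[row0 + row][col0 + col]
        for row in range(per_side)
        for col in range(per_side)
    )


fine_grid = [[1, 2, 3, 4],
             [5, 6, 7, 8],
             [9, 10, 11, 12],
             [13, 14, 15, 16]]
# A 16x16 region starting at pixel (9, 0) snaps to fine column 1, row 0.
cumulative = coarse_estimate(fine_grid, rx=9, ry=0, fine_size=8, coarse_size=16)
```

Where the snapping error is not acceptable, the remaining offset would need to be handled by corrections of the kind described for FIG. 8.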

In various implementations, program instructions of a software application are used to implement the methods and/or mechanisms described herein. For example, program instructions executable by a general or special purpose processor are contemplated. In various implementations, such program instructions can be represented by a high level programming language. In other implementations, the program instructions can be compiled from a high level programming language to a binary, intermediate, or other form. Alternatively, program instructions can be written that describe the behavior or design of hardware. Such program instructions can be represented by a high-level programming language, such as C. Alternatively, a hardware design language (HDL) such as Verilog can be used. In various implementations, the program instructions are stored on any of a variety of non-transitory computer readable storage mediums. The storage medium is accessible by a computing system during use to provide the program instructions to the computing system for program execution. Generally speaking, such a computing system includes at least one or more memories and one or more processors configured to execute program instructions.

It should be emphasized that the above-described implementations are only non-limiting examples of implementations. Numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. It is intended that the following claims be interpreted to embrace all such variations and modifications.

What is claimed is:
 1. A system comprising: an interface configured to receive a new video frame of a video stream; control logic coupled to the interface, wherein the control logic is configured to generate estimates of block-level pixel activity for the new video frame based on: motion estimation data for the new video frame; and previously calculated block-level pixel activity from a reference video frame; an encoder configured to generate an encoded video frame based on the estimates, wherein the encoded video frame represents the new video frame.
 2. The system as recited in claim 1, wherein for each block of the new video frame, the control logic is configured to: compare, to a first threshold, a cost of the block of the new video frame with respect to a corresponding block of the reference video frame; and generate an estimate of pixel activity for the block, wherein the estimate is equal to previously calculated pixel activity of a corresponding block in the reference video frame if the cost is less than the first threshold.
 3. The system as recited in claim 2, wherein for each block of the new video frame, the control logic is further configured to: compare the cost of the block to a second threshold; responsive to determining that the cost is greater than or equal to the first threshold and less than the second threshold: map the cost to a correction factor using a transfer function; and generate an estimate of the pixel activity for the block by applying the correction factor to the previously calculated pixel activity of the corresponding block in the reference video frame.
 4. The system as recited in claim 3, wherein the control logic is further configured to calculate the pixel activity for the block independently of the previously calculated pixel activity if the cost is greater than or equal to the second threshold.
 5. The system as recited in claim 1, wherein the control logic is configured to classify the new video frame based on the estimates, and wherein the block-level pixel activity represents gradients, co-occurrence matrices, or any other metric where results of a mathematical operation on a first pixel value and a second pixel value at a defined relative displacement to the first pixel value are aggregated on blocks of the new video frame.
 6. The system as recited in claim 1, wherein the previously calculated block-level pixel activity is at a first granularity, and wherein the encoder is configured to generate estimates for block-level pixel activity for the new video frame at a second granularity, wherein the first granularity is a finer granularity than the second granularity.
 7. The system as recited in claim 6, wherein the encoder is configured to generate cumulative estimates for the block-level pixel activity for each block of the new video frame by summing pixel activities for a plurality of blocks from the reference video frame.
 8. A method comprising: receiving, by a server, a new video frame of a video stream; and generating, by control logic, estimates of block-level pixel activity for a new video frame of the video stream based on: motion estimation data for the new video frame; and previously calculated block-level pixel activity from a reference video frame; and generating, by an encoder, an encoded video frame based on the estimates, wherein the encoded video frame represents the new video frame.
 9. The method as recited in claim 8, further comprising: comparing, to a first threshold, a cost of a block of the new video frame with respect to a corresponding block of the reference video frame; and generating an estimate of pixel activity for the block, wherein the estimate is equal to previously calculated pixel activity of a corresponding block in the reference video frame if the cost is less than the first threshold.
 10. The method as recited in claim 9, further comprising: comparing the cost of the block to a second threshold; responsive to determining that the cost is greater than or equal to the first threshold and less than the second threshold: mapping the cost to a correction factor using a transfer function; and generating an estimate of the pixel activity for the block by applying the correction factor to the previously calculated pixel activity of the corresponding block in the reference video frame.
 11. The method as recited in claim 10, further comprising calculating the pixel activity for the block independently of the previously calculated pixel activity if the cost is greater than or equal to the second threshold.
 12. The method as recited in claim 8, further comprising classifying the new video frame based on the estimates, wherein the block-level pixel activity represents gradients, co-occurrence matrices, or any other metric where results of a mathematical operation on a first pixel value and a second pixel value at a defined relative displacement to the first pixel value are aggregated on blocks of the new video frame.
 13. The method as recited in claim 8, wherein the previously calculated block-level pixel activity is at a first granularity, and wherein the method further comprising generating estimates for block-level pixel activity for the new video frame at a second granularity, wherein the first granularity is a finer granularity than the second granularity.
 14. The method as recited in claim 13, further comprising generating cumulative estimates for the block-level pixel activity for each block of the new video frame by summing pixel activities for a plurality of blocks from the reference video frame.
 15. An apparatus comprising: a memory; an encoder coupled to the memory; and control logic coupled to the encoder, wherein the control logic is configured to generate estimates of block-level pixel activity for a new video frame based on: motion estimation data for the new video frame; and previously calculated block-level pixel activity from a reference video frame stored in the memory; wherein the encoder is configured to generate an encoded video frame based on the estimates, wherein the encoded video frame represents the new video frame.
 16. The apparatus as recited in claim 15, wherein for each block of the new video frame, the control logic is configured to: compare, to a first threshold, a cost of the block of the new video frame with respect to a corresponding block of the reference video frame; and generate an estimate of pixel activity for the block, wherein the estimate is equal to previously calculated pixel activity of a corresponding block in the reference video frame if the cost is less than the first threshold.
 17. The apparatus as recited in claim 16, wherein for each block of the new video frame, the control logic is further configured to: compare the cost of the block to a second threshold; responsive to determining that the cost is greater than or equal to the first threshold and less than the second threshold: map the cost to a correction factor using a transfer function; and generate an estimate of the pixel activity for the block by applying the correction factor to the previously calculated pixel activity of the corresponding block in the reference video frame.
 18. The apparatus as recited in claim 17, wherein the control logic is further configured to calculate the pixel activity for the block independently of the previously calculated pixel activity if the cost is greater than or equal to the second threshold.
 19. The apparatus as recited in claim 15, wherein the control logic is configured to classify the new video frame based on the estimates, and wherein the block-level pixel activity represents gradients, co-occurrence matrices, or any other metric where results of a mathematical operation on a first pixel value and a second pixel value at a defined relative displacement to the first pixel value are aggregated on blocks of the new video frame.
 20. The apparatus as recited in claim 15, wherein the previously calculated block-level pixel activity is at a first granularity, and wherein encoder is configured to generate estimates for block-level pixel activity for the new video frame at a second granularity, wherein the first granularity is a finer granularity than the second granularity.